how to read pdf in python

41

how to read pdf in python -

# importing required modules
import PyPDF2
 
# creating a pdf file object
pdfFileObj = open('example.pdf', 'rb')
 
# creating a pdf reader object
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
 
# printing number of pages in pdf file
print(pdfReader.numPages)
 
# creating a page object
pageObj = pdfReader.getPage(0)
 
# extracting text from page
print(pageObj.extractText())
 
# closing the pdf file object
pdfFileObj.close()

read pdf py -

import textract
text = textract.process('path/to/pdf/file', method='pdfminer')

Comments

Submit
0 Comments